Skip to content

Consolidate experiment design-matrix attributes into xr.Dataset#849

Open
drbenvincent wants to merge 5 commits into
mainfrom
199-consolidate-xarray-datasets
Open

Consolidate experiment design-matrix attributes into xr.Dataset#849
drbenvincent wants to merge 5 commits into
mainfrom
199-consolidate-xarray-datasets

Conversation

@drbenvincent
Copy link
Copy Markdown
Collaborator

Summary

  • Bundles loose xr.DataArray attributes on 8 experiment classes into xr.Dataset objects, reducing attribute sprawl while keeping related design matrices together.
    • Pre/post classes (ITS, SC): 4 DataArrays consolidated into pre_design / post_design Datasets.
    • Formula-based classes (DiD, RD, RK, PrePostNEGD, PiecewiseITS, PanelRegression): 2 DataArrays consolidated into a single design Dataset.
  • All old attribute names preserved via deprecated @property accessors with DeprecationWarning for backward compatibility.
  • Updates reporting.py and tests to use the new Dataset access patterns internally.

Closes #199. Follow-up notebook migration tracked in #848.

Test plan

  • All 850 existing tests pass (0 failures)
  • prek run --all-files passes (lint, format, mypy, codespell, notebook validation)
  • Deprecated properties still work (notebooks and external code unaffected)

Made with Cursor

Add _predictor_data_name and _target_data_name class attributes to
PyMCModel so subclasses using non-default data node names can
customize without re-implementing _data_setter. Validate at predict
time and raise a clear ValueError if expected nodes are missing.

Also fix pre-existing mypy type: ignore codes in panel_regression.py
(attr-defined -> union-attr).

Made-with: Cursor
Remove _predictor_data_name / _target_data_name class attributes
(added complexity for a case no existing subclass needs). Keep the
validation that raises a clear ValueError when expected data nodes
are missing. Revert undeclared y_dtype behavioral change. Improve
test coverage with separate X-missing and y-missing error paths.

Made-with: Cursor
Bundle loose xr.DataArray attributes on experiment classes into
xr.Dataset objects to reduce attribute sprawl. Pre/post classes
(ITS, SC) use pre_design/post_design; formula-based classes use
a single design Dataset. Deprecated @Property accessors preserve
backward compatibility. Closes #199.

Made-with: Cursor
@github-actions github-actions Bot added the refactor Refactor, clean up, or improvement with no visible changes to the user label Apr 15, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 98.41270% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.62%. Comparing base (deb8774) to head (ba0faf5).

Files with missing lines Patch % Lines
causalpy/tests/test_pymc_models.py 93.18% 2 Missing and 1 partial ⚠️
causalpy/experiments/base.py 94.11% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #849      +/-   ##
==========================================
+ Coverage   94.60%   94.62%   +0.01%     
==========================================
  Files          80       81       +1     
  Lines       12764    12923     +159     
  Branches      770      776       +6     
==========================================
+ Hits        12076    12229     +153     
- Misses        485      488       +3     
- Partials      203      206       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Collaborator Author

@drbenvincent drbenvincent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial review from gpt-5.4-xhigh.

  1. causalpy/checks/convex_hull.py still calls deprecated aliases. ConvexHullCheck.run() reads sc.datapre_control and sc.datapre_treated, which on this branch now route through the new deprecation shims. I reproduced this locally and the check emits:

    • datapre_control is deprecated, use pre_design['control']
    • datapre_treated is deprecated, use pre_design['treated']

    That means a normal synthetic-control workflow now makes CausalPy warn against itself. Immediate fix direction: do a repo-wide sweep for internal uses of the deprecated names, switch those callers to design[...] / pre_design[...] / post_design[...], and add a warning-focused test in causalpy/tests/test_cross_cutting_checks.py asserting that ConvexHullCheck.run() stays warning-free.

  2. The refactor still has a lot of duplicate migration boilerplate, and that is making the change less elegant than it could be. Right now the same migration pattern is hand-written across the experiment classes:

    • build one or two xr.Datasets with near-identical X/y or control/treated wrapping logic,
    • preserve temporary raw arrays just long enough to build those datasets,
    • add near-identical deprecated properties that warn and then forward to the new dataset-backed location.

    You can see this pattern repeated in diff_in_diff.py, panel_regression.py, piecewise_its.py, prepostnegd.py, regression_discontinuity.py, regression_kink.py, plus the split pre/post variants in interrupted_time_series.py and synthetic_control.py. The practical problem is not just aesthetics: every repeated shim and dataset wrapper is another place to miss an internal migration, drift in warning text, or leave coverage behind. The ConvexHullCheck miss feels like a direct symptom of that copy-paste surface area.

    Suggested fix direction: centralize more of this in BaseExperiment (or a small mixin/helper module) so subclasses only describe what is experiment-specific. For example:

    • one helper for the common single-design case that builds the standard {"X", "y"} dataset given raw arrays, observation index, labels, and treated-unit labels;
    • one helper for split designs so ITS-style classes can reuse the same builder for pre/post periods;
    • one shared deprecation-forwarding helper so the compatibility properties become one-liners instead of each class repeating its own warnings.warn(...); return self.design[...] block.

    I would prefer that over adding more bespoke per-class migration code. If you want to avoid magic, you do not need a dynamic __getattr__; even explicit properties backed by a shared helper would already remove most of the duplication.

  3. The backward-compatibility layer is still under-tested, and the remote results reflect that. Every non-Codecov check is green, but codecov/patch is failing at 84.71% with 37 missing lines, concentrated in the new deprecated properties across the experiment classes. Since preserving old attribute access is one of the main claims of the PR, I’d expect targeted tests that exercise those legacy aliases and assert both the warning and the data equivalence to the new dataset-backed API.

    Suggested fix direction: add a small parametrized compatibility test suite rather than lots of bespoke tests. I would cover:

    • representative single-design aliases (X, y) and split-design aliases (pre_X, pre_y, post_X, post_y);
    • the synthetic-control aliases (datapre_control, datapre_treated, datapost_control, datapost_treated);
    • equality of the returned object/data with the new dataset-backed access path;
    • correct deprecation behavior via pytest.deprecated_call() or explicit warning capture.

    If you centralize the alias metadata, you can drive these tests from the same mapping and shrink both implementation duplication and test duplication at the same time.

Net: I like the direction of moving related arrays into xr.Datasets, but I don’t think this is ready yet. The current version is functionally plausible, but not yet simple or especially elegant because the migration logic is still spread across too many classes and one real regression has already slipped through. I’d first eliminate internal uses of deprecated aliases, then centralize the migration helpers, then land this with focused compatibility coverage.

Move _build_design_dataset helper and __getattr__ deprecation forwarding
into BaseExperiment, replacing per-class @Property blocks with a
declarative _deprecated_design_aliases dict. Fix convex_hull.py and
maketables_adapters.py to use the new API directly. Add parametrized
backward-compatibility tests covering all deprecated aliases.

Made-with: Cursor
Copy link
Copy Markdown
Collaborator Author

@drbenvincent drbenvincent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up adversarial review from gpt-5.4-xhigh.

No new blocking findings from me on the latest revision.

What satisfies my earlier review:

  • The duplicate migration boilerplate is now materially reduced. Moving deprecated alias forwarding into BaseExperiment.__getattr__ with a declarative _deprecated_design_aliases mapping is the kind of centralization I was asking for, and _build_design_dataset() usefully consolidates the common design["X"] / design["y"] wrapping path.
  • The self-warning regression is fixed. causalpy/checks/convex_hull.py now reads pre_design[...] directly, and causalpy/maketables_adapters.py also no longer relies on deprecated design aliases internally.
  • The backward-compatibility contract is now tested in the right way. causalpy/tests/test_deprecated_design_aliases.py checks both deprecation behavior and data identity, and it adds a warning-free test for the convex-hull path.
  • The remote results now support the implementation: codecov/patch is green, both test jobs are green, notebooks are green, docs are green, and prek is green.

I also ran the two most relevant local test targets from this follow-up:

  • pytest causalpy/tests/test_deprecated_design_aliases.py -q
  • pytest causalpy/tests/test_maketables_plugin.py -q

Both passed locally. The standalone invocations tripped the repo-wide coverage floor, which is expected for narrow pytest runs here and not a concern for the PR itself.

Residual note, not a blocker: the shared dataset helper currently covers the standard X/y layout but not the SyntheticControl control/treated layout, so there is still some experiment-specific wrapping code there. That no longer looks like problematic duplication to me; the important compatibility and warning-forwarding boilerplate is now centralized.

This satisfies my earlier objections.

@drbenvincent drbenvincent marked this pull request as ready for review April 15, 2026 21:08

_default_model_class: type[PyMCModel] | None = None

_deprecated_design_aliases: dict[str, tuple[str, str]] = {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick one — InversePropensityWeighting, InstrumentalVariable, and StaggeredDiD didn't get migrated to design (still using numpy self.X / self.y, see e.g. causalpy/experiments/inverse_propensity_weighting.py:111, instrumental_variable.py:155, staggered_did.py:332-338) and accordingly don't show up in _deprecated_design_aliases here. Assuming intentional given their non-standard layouts (IPW has t + outcome, IV has Z + endogenous treatment, staggered has the train/full split)? If so, maybe worth a sentence in the PR body so it's not mistaken for an oversight.

Comment on lines +139 to +140
self.pre_design["treated"].isel(treated_units=i).values,
self.pre_design["control"].values,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heads-up on the input convention: this call site converts to numpy via .values, but the new ConvexHullCheck.run in causalpy/checks/convex_hull.py:60-62 hands the same data to check_convex_hull_violation as xarray.DataArray directly (no .values). Both work — I checked end-to-end with violation cases and got identical results — but it's worth normalizing on one style across the two call sites and adding a one-liner in the helper's docstring saying it accepts numpy or xarray. Otherwise this becomes a quiet footgun the first time someone changes the helper to assume one shape.

Comment on lines +53 to +54
datapre_control = sc.pre_design["control"] # type: ignore[attr-defined]
datapre_treated = sc.pre_design["treated"] # type: ignore[attr-defined]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit: since applicable_methods = {SyntheticControl} and validate() already enforces the type, you could drop both # type: ignore[attr-defined] markers by adding assert isinstance(experiment, SyntheticControl) (or just calling self.validate(experiment)) at the top of run() — that narrows sc to SyntheticControl and pre_design becomes a known attribute for mypy. Same effect, no escape hatches.

Comment thread causalpy/pymc_models.py
for name in ("X", "y"):
if name not in self.named_vars:
raise ValueError(
f"Data node '{name}' not found in model. "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional polish: the class docstring above already points users at BayesianBasisExpansionTimeSeries as a concrete override example. Worth echoing that in the runtime message itself so users hitting this in a notebook don't have to dig — e.g. f"... override _data_setter() (see BayesianBasisExpansionTimeSeries for an example).". Pure DX nicety, not blocking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

refactor Refactor, clean up, or improvement with no visible changes to the user

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Simplify ExperimentalDesign classes by consolidating relevant properties into a xarray.Dataset

2 participants